Computer vision - Lab 7¶

Agenda¶

Image segmentation based on:

  • thresholding,
  • cluster analysis,
  • detecting image features (e.g. edges),
  • region growing.

Helpers¶

In [2]:
# to have popular descriptors like SIFT
!pip install -U opencv-python
!pip install -U opencv-contrib-python
In [3]:
%matplotlib inline
import cv2
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
import numpy as np
import PIL
from pandas import DataFrame
import pandas as pd
from IPython.display import display, HTML
from skimage.exposure import rescale_intensity
import json
import os
from itertools import product
import itertools
import random
from sklearn.mixture import GaussianMixture
import plotly.graph_objects as go
import sys
import glob

pd.options.display.html.border = 0
pd.options.display.float_format = '{:,.2f}'.format

Images¶

  • lena - the Lenna image (available at the link below) - one of the most popular images historically used for testing image processing and compression,
  • clevr - comes from the CLEVR dataset, which deals with the Visual Question Answering problem,
  • graf - a sample graffiti image from the OpenCV repository,
  • sudoku - a sample sudoku image from the OpenCV repository,
  • skittles - several images containing Skittles candies.
In [4]:
# download images
!wget -O lena_std.tif http://www.lenna.org/lena_std.tif
!wget -O clevr.jpg https://cs.stanford.edu/people/jcjohns/clevr/teaser.jpg
!wget -O graf.png https://github.com/opencv/opencv/raw/master/samples/data/graf1.png
!wget -O sudoku.png https://raw.githubusercontent.com/opencv/opencv/master/samples/data/sudoku.png
!wget -O skittles1.jpg https://upload.wikimedia.org/wikipedia/commons/c/ca/Skittles-Louisiana-2003.jpg
!wget -O skittles2.jpg https://hips.hearstapps.com/hmg-prod.s3.amazonaws.com/images/skittles-candy-3-pound-bag-1613163635.jpg
!wget -O skittles3.jpg https://blogs.microsoft.com/wp-content/uploads/prod/sites/140/2021/08/image001-300x225.png
!wget -O skittles4.jpg https://github.com/possibly-wrong/skittles/blob/master/images/145.jpg?raw=true
!wget -O skittles5.jpg https://investorplace.com/wp-content/uploads/2022/07/skittles.jpg

Visualization¶

In [5]:
def imshow(a):
  a = a.clip(0, 255).astype('uint8')
  if a.ndim == 3:
    if a.shape[2] == 4:
      a = cv2.cvtColor(a, cv2.COLOR_BGRA2RGBA)
    else:
      a = cv2.cvtColor(a, cv2.COLOR_BGR2RGB)
  display(PIL.Image.fromarray(a))
In [6]:
css = """
<style type="text/css">
  table, td, table.dataframe, table.dataframe td { 
    border: 1px solid black;  /* border: double; */
    border-collapse: collapse;
    border-style: solid;
    border-spacing: 0px;
    background-color: rgb(250,250,250);
    width: 18px;
    height: 18px;
    text-align: center;
    transform: scale(1.0);
    margin: 2px;
    }
</style>
"""

def h(s):
  return display(HTML(css + DataFrame(s).to_html(header=False, index=False)))
In [7]:
def h_color(a, cmap='gray', scale=2):
  s = [a.shape[0] * scale, a.shape[1] * scale]
  plt.figure(figsize=s)
  plt.tick_params(
    axis='both', which='both',
    bottom=False, top=False,
    labelbottom=False, labelleft=False, left=False, right=False
  )
  plt.imshow(a, cmap=cmap)
In [8]:
cmap = ListedColormap([
    'black', 'tomato', 'chocolate', 'darkorange',
    'gold', 'olive', 'green', 'deepskyblue',
    'blueviolet', 'hotpink'
])

def h_grid(grid, scale=1):
  h_color(grid, cmap, scale)
In [9]:
def pix_show(pixels, skip_each=1, height=800, width=800, colors=None):
  pixels = pixels[::skip_each]
  if colors is None: colors = pixels[:, ::-1]
  else: colors = colors[::skip_each]
  b, g, r = pixels[:, 0], pixels[:, 1], pixels[:, 2]
  fig = go.Figure(data=[
    go.Scatter3d(x=b, y=g, z=r, mode='markers', marker={
        'size': 2,
        'color': colors,
        'opacity': 0.7
    })
  ], layout_xaxis_range=[0, 1], layout_yaxis_range=[0, 1])
  scene = {
      'xaxis': dict(title='Blue'),
      'yaxis': dict(title='Green'),
      'zaxis': dict(title='Red')
  }
  fig.update_layout(autosize=False, height=height, width=width, scene=scene, showlegend=True)
  fig.show()

Image segmentation¶

Segmentation through thresholding¶

The simplest way to segment an image is to threshold the pixel intensities. Thresholding replaces all intensities above a certain threshold with one constant value and all intensities below it with another.

There is also segmentation with multiple thresholds, which was presented in the first class.
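
The same rule can be written directly with NumPy, which makes the pixel-wise nature of the operation explicit; a minimal sketch on a toy 2×2 array (the values are made up for illustration):

```python
import numpy as np

# Toy grayscale "image" with intensities on both sides of the threshold
a = np.array([[10, 200],
              [127, 128]], dtype=np.uint8)

threshold, max_value = 127, 255

# Binary thresholding: max_value where intensity > threshold, 0 elsewhere
binary = np.where(a > threshold, max_value, 0).astype(np.uint8)
print(binary)  # pixels 200 and 128 map to 255, the rest to 0
```

Note that, like cv2.THRESH_BINARY, the comparison is strict: a pixel exactly equal to the threshold maps to 0.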

In [10]:
lena = cv2.imread('./lena_std.tif', cv2.IMREAD_COLOR)
imshow(lena)
In [11]:
lena_gray = cv2.cvtColor(lena, cv2.COLOR_BGR2GRAY) 
imshow(lena_gray)
In [12]:
lut = np.array([255] * 100 + [0] * 100 + [255] * 56, dtype=np.uint8)  # uint8 LUT, as expected by cv2.LUT
lena_lut = cv2.LUT(lena_gray, lut)
imshow(lena_lut)

The OpenCV library also includes ready-made implementations of other simple image thresholding approaches. To perform a thresholding operation in OpenCV, call the threshold() function, which takes the image, the threshold value, the maximum value and the thresholding method to use.

The available thresholding methods include:

  • binary - 0 for pixels with intensities below the threshold, the maximum value for the others,
  • binary inverted - 0 for pixels with intensities above the threshold, the maximum value for the others,
  • truncate - the threshold value for pixels with intensities above the threshold, the others remain unchanged,
  • to zero - 0 for pixels with intensities below the threshold, the others remain unchanged,
  • to zero inverted - 0 for pixels with intensities above the threshold, the others remain unchanged.
In [13]:
_, lena_bin = cv2.threshold(lena_gray, 127, 255, cv2.THRESH_BINARY)
_, lena_bin_inv = cv2.threshold(lena_gray, 127, 255, cv2.THRESH_BINARY_INV)
_, lena_trunc = cv2.threshold(lena_gray, 127, 255, cv2.THRESH_TRUNC)
_, lena_tozero = cv2.threshold(lena_gray, 127, 255, cv2.THRESH_TOZERO)
_, lena_tozero_inv = cv2.threshold(lena_gray, 127, 255, cv2.THRESH_TOZERO_INV)

imshow(np.concatenate([lena_gray, lena_bin, lena_bin_inv], 1))
imshow(np.concatenate([lena_trunc, lena_tozero, lena_tozero_inv], 1))

Sudoku example¶

In [14]:
sudoku = cv2.imread('./sudoku.png', cv2.IMREAD_GRAYSCALE)
imshow(sudoku)
print(sudoku.shape)
(563, 558)
In [15]:
_, sudoku_bin = cv2.threshold(sudoku, 70, 255, cv2.THRESH_BINARY)
imshow(sudoku_bin)
In [16]:
lut = np.array([0] * 50 + [255] * 80 + [0] * 126, dtype=np.uint8)  # uint8 LUT, as expected by cv2.LUT
sudoku_lut = cv2.LUT(sudoku, lut)
imshow(sudoku_lut)

Otsu's method¶

Otsu's method is an algorithm that adaptively selects the threshold value so that the two resulting (binarized) classes have the lowest combined within-class pixel intensity variance (which is equivalent to maximizing the between-class variance).

The function minimized by Otsu's method is: $$\sigma^2_w(t) = Q_1\sigma^2_1 + Q_2\sigma^2_2$$ where:

  • $Q_i$ - the probability of a pixel belonging to the i-th class (a value of the cumulative distribution function): $$Q_i = P(f(x,y) < t_i)$$
  • $\sigma^2_i$ - the variance within the i-th class. To find the smallest value, it is enough to evaluate the expression for every candidate threshold (from 0 to 255).

The first step is to compute the probability of a pixel having a given intensity (from 0 to 255). It is enough to compute the histogram and normalize it. Additionally, the cumulative distribution function is computed; it will be needed to calculate the conditional means and variances.

In [17]:
def probs(img):
  h = cv2.calcHist([img], [0], None, [256], [0, 256])
  pdf = h.ravel() / h.sum()
  cdf = np.cumsum(pdf)
  return pdf, cdf

pdf, cdf = probs(sudoku)

plt.plot(pdf)
plt.show()

plt.plot(cdf)
plt.show()

A single iteration of the Otsu algorithm (for a fixed candidate threshold) splits the probability distribution into the occurrence probabilities of the two classes. The class mean is a conditional expected value, because we average pixel intensities under the condition that the pixel belongs to a given class (in other words: the ''expected value for a given class'').

$$E(C_1 | t_1) = \sum_{x=1}^{t_1} \frac{x P(x)}{Q_1}$$

Then, to calculate the (conditional!) variances, we use the previously calculated conditional expected values: $$\sigma^2_1 = \sum_{x=1}^{t_1} \frac{(x - E[C_1|t_1])^2 P(x)}{Q_1}$$

In [18]:
def cond_mean(x, p, q):
  return (x * p / q).sum()

def _otsu(x, pdf, cdf, epsilon=1e-6):
  x1, x2 = np.hsplit(np.arange(256), [x])
  p1, p2 = np.hsplit(pdf, [x])
  q1, q2 = cdf[x], cdf[-1] - cdf[x]

  if q1 < epsilon or q2 < epsilon:
      return None
  
  m1, m2 = cond_mean(x1, p1, q1), cond_mean(x2, p2, q2)
  s1, s2 = cond_mean((x1 - m1) ** 2, p1, q1), cond_mean((x2 - m2) ** 2, p2, q2)

  return s1 * q1 + s2 * q2

In its most basic version, the Otsu algorithm iterates over all possible splits (thresholds from 1 to 255) and selects the one for which the previously presented objective function returns the smallest value.

In [19]:
def otsu(img):
  pdf, cdf = probs(img)
  v_min = None
  threshold = 0

  for i in range(1, 256):
    v = _otsu(i, pdf, cdf)
    if v is not None and (v_min is None or v < v_min):
      v_min = v
      threshold = i

  _, img_otsu = cv2.threshold(img, threshold, 255, cv2.THRESH_BINARY)

  return threshold, img_otsu
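
As an aside, the same search can be done without the Python loop by precomputing class probabilities and partial means with cumulative sums; a minimal sketch (the helper name otsu_vectorized is our own, and it maximizes the equivalent between-class variance instead of minimizing the within-class one, so the result may differ from the loop version by one intensity level due to the split convention):

```python
import numpy as np

def otsu_vectorized(img):
  # Probability of each intensity (normalized histogram)
  p = np.bincount(img.ravel(), minlength=256).astype(np.float64)
  p /= p.sum()
  x = np.arange(256)

  # Class probabilities and partial mean sums for every candidate threshold
  q1 = np.cumsum(p)
  q2 = 1.0 - q1
  m = np.cumsum(x * p)
  m1 = np.where(q1 > 0, m / np.maximum(q1, 1e-12), 0)
  m2 = np.where(q2 > 0, (m[-1] - m) / np.maximum(q2, 1e-12), 0)

  # Maximizing the between-class variance is equivalent to minimizing
  # the within-class variance from the formula above
  between = q1 * q2 * (m1 - m2) ** 2
  return int(np.argmax(between))
```

On a bimodal image the returned threshold lands between the two intensity modes, just like the loop-based search.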

In order to use Otsu's method in OpenCV, it is enough to add THRESH_OTSU to the thresholding operation's flags.

In [20]:
th_self, sudoku_otsu_self = otsu(sudoku)
th_auto, sudoku_otsu_auto = cv2.threshold(sudoku, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

print('Found value of thresholding with the OTSU algorithm (custom implementation):', th_self)
print('Found value of thresholding with the OTSU algorithm (OpenCV):', th_auto)

print('\nOTSU (custom implementation):')
imshow(sudoku_otsu_self)
print('\nOTSU (OpenCV):')
imshow(sudoku_otsu_auto)
Found value of thresholding with the OTSU algorithm (custom implementation): 95
Found value of thresholding with the OTSU algorithm (OpenCV): 96.0

OTSU (custom implementation):
OTSU (OpenCV):

Adaptive methods¶

Among the thresholding methods there are also adaptive ones: methods that adjust the threshold value to the content of the image.

Adaptive thresholding methods often work very well because the input image is divided into smaller areas and the threshold value is chosen separately for each area. The motivation behind this mechanism is the fact that in real images the lighting (as well as focus, white balance, etc.) is uneven.

In [21]:
lena_ad_mean = cv2.adaptiveThreshold(lena_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 101, 2)
lena_ad_gauss = cv2.adaptiveThreshold(lena_gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 101, 2)

imshow(np.concatenate([lena_ad_mean, lena_ad_gauss], 1))
In [22]:
sudoku_ad_mean = cv2.adaptiveThreshold(sudoku, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 3, 2)
sudoku_ad_gauss = cv2.adaptiveThreshold(sudoku, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 3, 2)

imshow(np.concatenate([sudoku_ad_mean, sudoku_ad_gauss], 1))

Segmentation of multi-channel images¶

Segmentation of multi-channel images (e.g. RGB) with simple thresholding methods becomes problematic due to the need to define thresholds in an N-dimensional space. Therefore, cluster analysis methods are more often used to segment multi-channel images.

Cluster analysis finds clusters of pixels in some space (even directly in the intensity space!) and creates a separate pixel class for each cluster.

For the image below, let's perform a simple pixel intensity analysis.

In [23]:
graf = cv2.imread('./graf.png', cv2.IMREAD_COLOR)
graf = cv2.resize(graf, None, fx=0.75, fy=0.75)
imshow(graf)

The image, represented as a list of (BGR) pixels, is displayed in 3D space, where the coordinates of each pixel are its intensity values. Additionally, the pixels have been colored according to their intensities.

In [24]:
graf_pixels = graf.reshape([-1, 3])
pix_show(graf_pixels, 16)

From the analysis of the above visualization, a few conclusions can be drawn:

  • most pixels lie on a straight line between black (0, 0, 0) and white (255, 255, 255),
  • a few color clusters can be distinguished:
    • red,
    • blue,
    • gold/orange,
    • green,
    • claret,
    • dark purple.

One method of splitting a space into clusters is the Gaussian mixture model, which approximates the distribution of the data with N Gaussian components. It is a method with trainable parameters, so a sample of data is needed to fit the mathematical model.

A ready-made implementation of the algorithm is available in the scikit-learn library.

We will use the list of (BGR) pixels as the data to fit the model, and then assign each pixel the index of the Gaussian component to which it belongs with the highest probability.

In [25]:
# model initialization and training
model = GaussianMixture(n_components=8)
model.fit(graf_pixels)

# assigning classes to pixels
segments = model.predict(graf_pixels)
print(segments)
[5 5 0 ... 2 2 2]

The next step will be to calculate the average color of each class (segment) and display the pixels again, colored according to the segmentation.

In [26]:
segments_colors = np.stack([graf_pixels[segments==i].mean(0) for i in range(8)], 0)
colors = np.take(segments_colors, segments, 0)

pix_show(graf_pixels, 16, colors=colors[:, ::-1])

The pixels with their assigned classes constitute the final cluster-based segmentation of the image. The input image and the segmentation result are shown below.

In [27]:
segmented = colors.reshape(graf.shape)
imshow(np.concatenate([graf, segmented], 1))

Segmentation by edge detection¶

Segmentation through edge detection builds on the material from the previous class on detecting key points, corners and edges.

The idea is to divide the image along its edges and then fill the closed areas, assigning consecutive identifiers to consecutive separable areas.

Assigning identifiers to the areas is presented in the next section.

In [28]:
clevr = cv2.imread('./clevr.jpg', cv2.IMREAD_COLOR)
clevr = cv2.resize(clevr, None, fx=0.5, fy=0.5)
imshow(clevr)
In [29]:
clevr_gray = cv2.cvtColor(clevr, cv2.COLOR_BGR2GRAY)
imshow(clevr_gray)
In [30]:
canny_high, _ = cv2.threshold(clevr_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
canny_low = 0.5 * canny_high

clevr_canny = cv2.Canny(clevr_gray, canny_low, canny_high, apertureSize=3)
clevr_canny = cv2.morphologyEx(clevr_canny, cv2.MORPH_CLOSE, kernel=np.ones((3, 3), np.uint8))
imshow(clevr_canny)

Segmentation by region growing¶

Segmentation by region growing consists in iteratively merging adjacent areas until a certain condition is met. Areas are merged after passing a uniformity test, and the algorithm runs until a stop condition is met.

Uniformity test - merging two areas and checking a certain condition. An example condition might be the average difference of pixel intensity between the two areas: if it is greater than a certain threshold, the areas are not uniform and are not merged.

Stop condition - this condition can be understood as the absence of any further possible merges, or as an early-stopping criterion (e.g. when we do not want the areas to grow beyond a set size limit).
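
The uniformity test described above can be sketched as a breadth-first flood fill that admits a neighbour only while its intensity stays within a tolerance of the seed pixel; the helper name grow_with_tolerance and the parameter tol are our own:

```python
from collections import deque

import numpy as np

def grow_with_tolerance(img, seed, tol=10):
  # Grow a region from `seed`, admitting 4-connected neighbours whose
  # intensity differs from the seed pixel by at most `tol`
  h, w = img.shape
  mask = np.zeros((h, w), bool)
  seed_val = int(img[seed])
  queue = deque([seed])
  mask[seed] = True
  while queue:
    y, x = queue.popleft()
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
      ny, nx = y + dy, x + dx
      if (0 <= ny < h and 0 <= nx < w and not mask[ny, nx]
          and abs(int(img[ny, nx]) - seed_val) <= tol):
        mask[ny, nx] = True
        queue.append((ny, nx))
  return mask
```

The manual implementation below uses the strictest version of this test (only identical intensities are merged), which is sufficient because it operates on a binarized edge image.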

In [31]:
regions = np.zeros(clevr_canny.shape[:2], np.int32)
neighbours = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def find_neighbours(img, y, x):
  c_neighbours = []
  for dy, dx in neighbours:
    ny, nx = y + dy, x + dx

    if ny < 0 or ny >= img.shape[0] or nx < 0 or nx >= img.shape[1]:
      continue

    if regions[ny, nx] > 0:
      continue
    
    if img[ny, nx] == 255:
      continue

    if img[y, x] == img[ny, nx]:
      c_neighbours.append((ny, nx))
    
  return c_neighbours


def grow_region(img, y, x, cls):
  regions[y, x] = cls

  c_neighbours = find_neighbours(img, y, x)
  for ny, nx in c_neighbours:
      regions[ny, nx] = cls

  while len(c_neighbours) > 0:
    new_neighbours = []
    for ny, nx in c_neighbours:
      i_new_neighbours = find_neighbours(img, ny, nx)
      for _ny, _nx in i_new_neighbours:
        regions[_ny, _nx] = cls

      new_neighbours.extend(i_new_neighbours)

    c_neighbours = new_neighbours


i = 1
for y in range(clevr_canny.shape[0]):
  for x in range(clevr_canny.shape[1]):
    if regions[y, x] == 0 and clevr_canny[y, x] == 0:
      grow_region(clevr_canny, y, x, i)
      i += 1

mean_colors = np.stack([np.array([255, 255, 255]) if j == 0 else clevr[regions==j].mean(0) for j in range(i)], 0)
regions_colors = np.take(mean_colors, regions, 0)

imshow(regions_colors)

Tasks¶

Task 1¶

As in the section on multi-channel image segmentation, perform the same pixel intensity cluster analysis for the './skittles1.jpg' image, and then segment the image using the K-Means algorithm (available, among others, in the scikit-learn library: sklearn.cluster.KMeans).

Present the intermediate results:

  • the BGR input image,
  • the BGR pixels in 3D space,
  • the segmentation result on the BGR pixels in 3D space,
  • the segmentation result as a 2D image (BGR).
In [32]:
from sklearn.cluster import KMeans
In [33]:
img = cv2.imread('./skittles1.jpg', cv2.IMREAD_COLOR)
img = cv2.resize(img, None, fx=0.8, fy=0.8)
imshow(img)

skittles_pix = img.reshape([-1, 3])
pix_show(skittles_pix, 16)

model = KMeans(n_clusters=8)
model.fit(skittles_pix)

segments = model.predict(skittles_pix)

segments_colors = np.stack([skittles_pix[segments==i].mean(0) for i in range(8)], 0)
colors = np.take(segments_colors, segments, 0)
pix_show(skittles_pix, 16, colors=colors[:, ::-1])

segmented = colors.reshape(img.shape)

imshow(np.concatenate([img, segmented], 1))

Task 2¶

Using the methods you learned in the previous classes, find the number of Skittles in the image './skittles1.jpg' (it is not necessary to use the solution from task 1). Show intermediate results and describe the processing steps in comments.

Show the original image with the individual Skittles found marked on it.

In [237]:
def invoke_prints(d):
    for k, v in d.items():
        print(f"\n{k} applied:")
        imshow(v)

def count_skittles(img, code=None):
    ## Store original image size
    orig_size = img.shape[:2][::-1]

    ## Deal with small images
    img = cv2.resize(img, (1024, 768), interpolation=cv2.INTER_NEAREST)

    ## Get grayscale image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    ## Remove noise with opening
    struct5 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    opening = cv2.morphologyEx(gray, cv2.MORPH_OPEN, struct5, iterations=3)

    ## Blur the image
    blur = cv2.blur(opening, (5, 5))

    ## Apply Canny (the threshold() call returns the threshold value, 127, used as the high Canny threshold)
    canny_high, _ = cv2.threshold(blur, 127, 255, cv2.THRESH_BINARY_INV)
    canny_low = 0.5 * canny_high
    canny = cv2.Canny(blur, canny_low, canny_high)

    ## Dilate resulting Canny
    struct3 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    dilated = cv2.dilate(canny, struct3, iterations=5)

    ## Collect contours
    contours, hierarchy = cv2.findContours(dilated, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)

    ## Draw contours
    cv2.drawContours(img, contours, -1, 255, -1)

    if code is not None:
      ## only in task 2: showing the steps here would overload the output of task 3
      invoke_prints({"Grayscale": gray, "Opening": opening, "Blur": blur, "Canny": canny, "Dilation": dilated})

    ## Display the result
    print(f"\nAmount of skittles on the image: {len(contours)}")

    ## Return resulting image in its original size
    return cv2.resize(img, orig_size)
In [238]:
imshow(count_skittles(cv2.imread('./skittles1.jpg'), "task2"))
Grayscale applied:
Opening applied:
Blur applied:
Canny applied:
Dilation applied:
Amount of skittles on the image: 59

Task 3¶

  1. Test the solution from task 2 on the remaining Skittles images.
  2. Improve the solution so that it works properly for these images.
In [239]:
for file in glob.glob('./skittles*'):
    skittles = cv2.imread(file, cv2.IMREAD_COLOR)
    imshow(np.concatenate([cv2.resize(skittles, None, fx=0.4, fy=0.4), cv2.resize(count_skittles(skittles), None, fx=0.4, fy=0.4)], 1))
Amount of skittles on the image: 44
Amount of skittles on the image: 21
Amount of skittles on the image: 84
Amount of skittles on the image: 58
Amount of skittles on the image: 59